-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[TTI] Don't drop VP intrinsic args when delegating to non-vp equivalent #147677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously we only carried the type arguments which caused value-based costs to be inadvertantly changed into type-based costs. I'm just using vp.is.fpclass as an example intrinsic for now since the type based cost seems to differ from the value based cost, and most normal intrinsics e.g. min/max have the same value + type based cost. We still need to handle the cost properly for is.fpclass in a second patch. This is needed for an upcoming patch to handle the cost of llvm.experimental.vp.reverse which suffers from the same problem.
@llvm/pr-subscribers-llvm-analysis Author: Luke Lau (lukel97) ChangesPreviously we only carried the type arguments which caused value-based costs to be inadvertantly changed into type-based costs. I'm just using vp.is.fpclass as an example intrinsic for now since the type based cost seems to differ from the value based cost, and most normal intrinsics e.g. min/max have the same value + type based cost. We still need to handle the cost properly for is.fpclass in a second patch. This is needed for an upcoming patch to handle the cost of llvm.experimental.vp.reverse which suffers from the same problem. Full diff: https://github.com/llvm/llvm-project/pull/147677.diff 2 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 2b9be43eadb7a..21da0311fa434 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -1781,6 +1781,10 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
assert(ICA.getArgTypes().size() >= 2 &&
"Expected VPIntrinsic to have Mask and Vector Length args and "
"types");
+
+ ArrayRef<const Value *> NewArgs = ArrayRef(ICA.getArgs());
+ if (!ICA.isTypeBasedOnly())
+ NewArgs = NewArgs.drop_back(2);
ArrayRef<Type *> NewTys = ArrayRef(ICA.getArgTypes()).drop_back(2);
// VPReduction intrinsics have a start value argument that their non-vp
@@ -1788,11 +1792,14 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
// counterpart.
if (VPReductionIntrinsic::isVPReduction(ICA.getID()) &&
*FID != Intrinsic::vector_reduce_fadd &&
- *FID != Intrinsic::vector_reduce_fmul)
+ *FID != Intrinsic::vector_reduce_fmul) {
+ if (!ICA.isTypeBasedOnly())
+ NewArgs = NewArgs.drop_front();
NewTys = NewTys.drop_front();
+ }
- IntrinsicCostAttributes NewICA(*FID, ICA.getReturnType(), NewTys,
- ICA.getFlags());
+ IntrinsicCostAttributes NewICA(*FID, ICA.getReturnType(), NewArgs,
+ NewTys, ICA.getFlags());
return thisT()->getIntrinsicInstrCost(NewICA, CostKind);
}
}
diff --git a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
index ea3c47dc34201..ad239a511d747 100644
--- a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
@@ -1648,3 +1648,109 @@ define void @splice() {
%splice_nxv2i1 = call <vscale x 2 x i1> @llvm.experimental.vp.splice.nxv2i1(<vscale x 2 x i1> zeroinitializer, <vscale x 2 x i1> zeroinitializer, i32 1, <vscale x 2 x i1> zeroinitializer, i32 poison, i32 poison)
ret void
}
+
+define void @is.fpclass() {
+; ARGBASED-LABEL: 'is.fpclass'
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %1 = call <2 x i1> @llvm.vp.is.fpclass.v2bf16(<2 x bfloat> undef, i32 0, <2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %2 = call <4 x i1> @llvm.vp.is.fpclass.v4bf16(<4 x bfloat> undef, i32 0, <4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %3 = call <8 x i1> @llvm.vp.is.fpclass.v8bf16(<8 x bfloat> undef, i32 0, <8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %4 = call <16 x i1> @llvm.vp.is.fpclass.v16bf16(<16 x bfloat> undef, i32 0, <16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %5 = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> undef, i32 0, <2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %6 = call <4 x i1> @llvm.vp.is.fpclass.v4f16(<4 x half> undef, i32 0, <4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %7 = call <8 x i1> @llvm.vp.is.fpclass.v8f16(<8 x half> undef, i32 0, <8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %8 = call <16 x i1> @llvm.vp.is.fpclass.v16f16(<16 x half> undef, i32 0, <16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %9 = call <2 x i1> @llvm.vp.is.fpclass.v2f32(<2 x float> undef, i32 0, <2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %10 = call <4 x i1> @llvm.vp.is.fpclass.v4f32(<4 x float> undef, i32 0, <4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %11 = call <8 x i1> @llvm.vp.is.fpclass.v8f32(<8 x float> undef, i32 0, <8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %12 = call <16 x i1> @llvm.vp.is.fpclass.v16f32(<16 x float> undef, i32 0, <16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 6 for instruction: %13 = call <2 x i1> @llvm.vp.is.fpclass.v2f64(<2 x double> undef, i32 0, <2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 10 for instruction: %14 = call <4 x i1> @llvm.vp.is.fpclass.v4f64(<4 x double> undef, i32 0, <4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 18 for instruction: %15 = call <8 x i1> @llvm.vp.is.fpclass.v8f64(<8 x double> undef, i32 0, <8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 34 for instruction: %16 = call <16 x i1> @llvm.vp.is.fpclass.v16f64(<16 x double> undef, i32 0, <16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %17 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2bf16(<vscale x 2 x bfloat> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %18 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4bf16(<vscale x 4 x bfloat> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %19 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8bf16(<vscale x 8 x bfloat> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %20 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16bf16(<vscale x 16 x bfloat> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %21 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %22 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f16(<vscale x 4 x half> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %23 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f16(<vscale x 8 x half> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %24 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f16(<vscale x 16 x half> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %25 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %26 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f32(<vscale x 4 x float> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %27 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f32(<vscale x 8 x float> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %28 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f32(<vscale x 16 x float> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %29 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f64(<vscale x 2 x double> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %30 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f64(<vscale x 4 x double> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %31 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f64(<vscale x 8 x double> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Invalid cost for instruction: %32 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f64(<vscale x 16 x double> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; ARGBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'is.fpclass'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %1 = call <2 x i1> @llvm.vp.is.fpclass.v2bf16(<2 x bfloat> undef, i32 0, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %2 = call <4 x i1> @llvm.vp.is.fpclass.v4bf16(<4 x bfloat> undef, i32 0, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %3 = call <8 x i1> @llvm.vp.is.fpclass.v8bf16(<8 x bfloat> undef, i32 0, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 65 for instruction: %4 = call <16 x i1> @llvm.vp.is.fpclass.v16bf16(<16 x bfloat> undef, i32 0, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %5 = call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> undef, i32 0, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %6 = call <4 x i1> @llvm.vp.is.fpclass.v4f16(<4 x half> undef, i32 0, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %7 = call <8 x i1> @llvm.vp.is.fpclass.v8f16(<8 x half> undef, i32 0, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 65 for instruction: %8 = call <16 x i1> @llvm.vp.is.fpclass.v16f16(<16 x half> undef, i32 0, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %9 = call <2 x i1> @llvm.vp.is.fpclass.v2f32(<2 x float> undef, i32 0, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %10 = call <4 x i1> @llvm.vp.is.fpclass.v4f32(<4 x float> undef, i32 0, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %11 = call <8 x i1> @llvm.vp.is.fpclass.v8f32(<8 x float> undef, i32 0, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 65 for instruction: %12 = call <16 x i1> @llvm.vp.is.fpclass.v16f32(<16 x float> undef, i32 0, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 9 for instruction: %13 = call <2 x i1> @llvm.vp.is.fpclass.v2f64(<2 x double> undef, i32 0, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 17 for instruction: %14 = call <4 x i1> @llvm.vp.is.fpclass.v4f64(<4 x double> undef, i32 0, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 33 for instruction: %15 = call <8 x i1> @llvm.vp.is.fpclass.v8f64(<8 x double> undef, i32 0, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 65 for instruction: %16 = call <16 x i1> @llvm.vp.is.fpclass.v16f64(<16 x double> undef, i32 0, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %17 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2bf16(<vscale x 2 x bfloat> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %18 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4bf16(<vscale x 4 x bfloat> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %19 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8bf16(<vscale x 8 x bfloat> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %20 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16bf16(<vscale x 16 x bfloat> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %21 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %22 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f16(<vscale x 4 x half> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %23 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f16(<vscale x 8 x half> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %24 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f16(<vscale x 16 x half> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %25 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %26 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f32(<vscale x 4 x float> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %27 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f32(<vscale x 8 x float> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %28 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f32(<vscale x 16 x float> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %29 = call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f64(<vscale x 2 x double> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %30 = call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f64(<vscale x 4 x double> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %31 = call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f64(<vscale x 8 x double> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Invalid cost for instruction: %32 = call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f64(<vscale x 16 x double> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ call <2 x i1> @llvm.vp.is.fpclass.v2bf16(<2 x bfloat> undef, i32 0, <2 x i1> undef, i32 undef)
+ call <4 x i1> @llvm.vp.is.fpclass.v4bf16(<4 x bfloat> undef, i32 0, <4 x i1> undef, i32 undef)
+ call <8 x i1> @llvm.vp.is.fpclass.v8bf16(<8 x bfloat> undef, i32 0, <8 x i1> undef, i32 undef)
+ call <16 x i1> @llvm.vp.is.fpclass.v16bf16(<16 x bfloat> undef, i32 0, <16 x i1> undef, i32 undef)
+ call <2 x i1> @llvm.vp.is.fpclass.v2f16(<2 x half> undef, i32 0, <2 x i1> undef, i32 undef)
+ call <4 x i1> @llvm.vp.is.fpclass.v4f16(<4 x half> undef, i32 0, <4 x i1> undef, i32 undef)
+ call <8 x i1> @llvm.vp.is.fpclass.v8f16(<8 x half> undef, i32 0, <8 x i1> undef, i32 undef)
+ call <16 x i1> @llvm.vp.is.fpclass.v16f16(<16 x half> undef, i32 0, <16 x i1> undef, i32 undef)
+ call <2 x i1> @llvm.vp.is.fpclass.v2f32(<2 x float> undef, i32 0, <2 x i1> undef, i32 undef)
+ call <4 x i1> @llvm.vp.is.fpclass.v4f32(<4 x float> undef, i32 0, <4 x i1> undef, i32 undef)
+ call <8 x i1> @llvm.vp.is.fpclass.v8f32(<8 x float> undef, i32 0, <8 x i1> undef, i32 undef)
+ call <16 x i1> @llvm.vp.is.fpclass.v16f32(<16 x float> undef, i32 0, <16 x i1> undef, i32 undef)
+ call <2 x i1> @llvm.vp.is.fpclass.v2f64(<2 x double> undef, i32 0, <2 x i1> undef, i32 undef)
+ call <4 x i1> @llvm.vp.is.fpclass.v4f64(<4 x double> undef, i32 0, <4 x i1> undef, i32 undef)
+ call <8 x i1> @llvm.vp.is.fpclass.v8f64(<8 x double> undef, i32 0, <8 x i1> undef, i32 undef)
+ call <16 x i1> @llvm.vp.is.fpclass.v16f64(<16 x double> undef, i32 0, <16 x i1> undef, i32 undef)
+ call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2bf16(<vscale x 2 x bfloat> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4bf16(<vscale x 4 x bfloat> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8bf16(<vscale x 8 x bfloat> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16bf16(<vscale x 16 x bfloat> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f16(<vscale x 2 x half> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f16(<vscale x 4 x half> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f16(<vscale x 8 x half> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f16(<vscale x 16 x half> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f32(<vscale x 2 x float> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f32(<vscale x 4 x float> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f32(<vscale x 8 x float> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f32(<vscale x 16 x float> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i1> @llvm.vp.is.fpclass.nxv2f64(<vscale x 2 x double> undef, i32 0, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i1> @llvm.vp.is.fpclass.nxv4f64(<vscale x 4 x double> undef, i32 0, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i1> @llvm.vp.is.fpclass.nxv8f64(<vscale x 8 x double> undef, i32 0, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i1> @llvm.vp.is.fpclass.nxv16f64(<vscale x 16 x double> undef, i32 0, <vscale x 16 x i1> undef, i32 undef)
+ ret void
+}
|
✅ With the latest revision this PR passed the undef deprecator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll get to this in a bit, but first, do a s/undef/poison/?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused; questions follow.
A experimental_vp_reverse isn't exactly functionally the same as vector_reverse, so previously it wasn't getting picked up by the generic VP costing code that reuses the non-VP equivalents. But for costing purposes it's good enough so we can reuse it. The RISC-V costs are still incorrect and are showing up as scalarized. llvm#147677 aims to fix part of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Just for curious, do you know which part of cost for is_fpclass
make the type-based different from value-based query?
I think it's to do with how |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
A experimental_vp_reverse isn't exactly functionally the same as vector_reverse, so previously it wasn't getting picked up by the generic VP costing code that reuses the non-VP equivalents. But for costing purposes it's good enough so we can reuse it. The RISC-V costs are still incorrect and are showing up as scalarized. llvm#147677 aims to fix part of this.
This is causing a compiler crash when building the llvm-test-suite in the rva23-evl configuration https://lab.llvm.org/staging/#/builders/210/builds/988 |
Previously we only carried the type arguments which caused value-based costs to be inadvertantly changed into type-based costs.
I'm just using vp.is.fpclass as an example intrinsic for now since the type based cost seems to differ from the value based cost, and most normal intrinsics e.g. min/max have the same value + type based cost.
We still need to handle the cost properly for is.fpclass in a second patch.
This is needed for an upcoming patch to handle the cost of llvm.experimental.vp.reverse which suffers from the same problem.